Creating and controlling video-realistic talking heads

Authors

  • Frédéric Elisei
  • Matthias Odisio
  • Gérard Bailly
  • Pierre Badin
Abstract

We present a linear three-dimensional modeling paradigm for the lips and face that captures the audiovisual speech activity of a given speaker with only six parameters. Our articulatory models are constructed from real data (front and profile images), using a linear component analysis of about 200 3D coordinates of fleshpoints on the subject's face and lips. Compared to a raw component analysis, our construction approach leads to somewhat more comparable relations across subjects: by construction, the six parameters have a clear phonetic/articulatory interpretation. We use such a speaker-specific articulatory model to regularize MPEG-4 facial articulation parameters (FAP) and show that this regularization process can drastically reduce bandwidth, noise and quantization artifacts. We then present how analysis-by-synthesis techniques using the speaker-specific model allow the tracking of facial movements. Finally, the results of this tracking scheme have been used to develop a text-to-audiovisual speech system.
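
The idea of describing many 3D fleshpoint coordinates with six linear parameters can be illustrated with a minimal sketch. It is not the authors' method: the abstract states that their construction differs from a raw component analysis, whereas this sketch uses plain principal component analysis on synthetic data. The function names, the 200 points with 3 coordinates each, and the 40 training frames are all assumptions made for illustration. The `regularize` step only shows the general principle of projecting a noisy measurement onto the model subspace; it does not implement MPEG-4 FAP coding.

```python
import numpy as np


def fit_linear_face_model(frames, n_params=6):
    """Fit a linear shape model with plain PCA (an assumption, see lead-in).

    frames: array of shape (n_frames, n_coords), each row a flattened set
    of 3D fleshpoint coordinates for one recorded articulation.
    Returns the mean shape and an orthonormal basis of shape (n_params, n_coords).
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are orthonormal directions ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_params]


def encode(shape, mean, basis):
    """Project one flattened shape onto the model: returns n_params values."""
    return basis @ (shape - mean)


def decode(params, mean, basis):
    """Rebuild the flattened 3D coordinates from the model parameters."""
    return mean + params @ basis


def regularize(noisy_shape, mean, basis):
    """Replace a noisy measurement by its closest shape in the model subspace."""
    return decode(encode(noisy_shape, mean, basis), mean, basis)


# Synthetic demonstration data (hypothetical, not taken from the paper).
rng = np.random.default_rng(0)
n_frames, n_points, n_params = 40, 200, 6
latent = rng.normal(size=(n_frames, n_params))
true_basis = rng.normal(size=(n_params, n_points * 3))
frames = 10.0 + latent @ true_basis

mean, basis = fit_linear_face_model(frames, n_params)

clean = frames[0]
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
cleaned = regularize(noisy, mean, basis)

print("parameters per frame:", encode(clean, mean, basis).shape[0])
print("error before projection:", np.linalg.norm(noisy - clean))
print("error after projection: ", np.linalg.norm(cleaned - clean))
```

In this sketch each frame is transmitted as six numbers instead of 600 coordinates, and projecting a noisy frame onto the six-dimensional subspace discards the part of the noise that lies outside it. This is one plausible reading of why a speaker-specific model could reduce bandwidth and noise.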

Similar resources

Photo-Realistic Talking-Heads from Image Samples

This paper describes a system for creating a photo-realistic model of the human head that can be animated and lip-synched from phonetic transcripts of text. Combined with a state-of-the-art text-to-speech synthesizer (TTS), it generates video animations of talking heads that closely resemble real people. To obtain a naturally looking head, we choose a “data-driven” approach. We record a talking...

Building speaker-specific lip models for talking heads from 3d face data

When creating realistic talking head animations, accurate modeling of speech articulators is important for speech perceptibility. Previous lip modeling methods such as simple numerical lip modeling focus on creating a general lip model without incorporating lip speaker variations. Here we present a method for creating accurate speaker-specific lip representations that retain the individual char...

Dialog-driven video-realistic image-based eye animation

Talking-heads are useful to give a face to multimedia applications such as virtual operators or news readers in dialog systems. However, their great commercial potentials can only become true, if talking-heads are indistinguishable from real recorded videos and at the same time correctly model the human-like behavior. For this, mouth as well as nonverbal behaviors such as head movements, facial...

Sample-Based Synthesis of Photo-Realistic Talking Heads

This paper describes a system that generates photorealistic video animations of talking heads. First the system derives head models from existing video footage using image recognition techniques. It locates, extracts and labels facial parts such as mouth, eyes, and eyebrows into a compact library. Then, using these face models and a text-to-speech synthesizer, it synthesizes new video sequences...

Quality of talking heads in different interaction and media contexts

We investigate the impact of three different factors on the quality of talking heads as metaphors of a spoken dialogue system in the smart home domain. The main focus lies on the effect of voice and head characteristics on audio and video quality, as well as overall quality. Furthermore, the influence of interactivity and of media context on user perception is analysed. For this purpose two sub...

Publication date: 2001